Campbell and Fiske (1959) developed four criteria of
construct validity when measures of more than one trait are obtained
with more than one method. In this study these criteria are compared
with two other procedures--an analysis of variance (ANOVA) model and
confirmatory factor analysis--for analyzing multitrait-multimethod
(MTMM) data. The principal advantage of the ANOVA model is a
convenient summary and test of convergent, divergent and method/halo
effects. However, the limitations of this approach are even more
numerous than those encountered with the Campbell-Fiske criteria, and
so the ANOVA approach should only be used to supplement other
procedures. Confirmatory factor analysis provides a direct test of
the statistical significance and importance of various trait and
method factors. The size of the factor loadings provides a convenient
description of the magnitude of method and trait effects. By
constraining various parameters the researcher may formulate and test
alternative configurations of method and trait factors. Consequently,
confirmatory factor analysis offers the advantages of both the other
approaches without many of their limitations, and is the recommended
procedure for analyzing MTMM data. (Author/Bil
Abstract

Campbell and Fiske (1959) have developed four criteria of construct validity
when measures of more than one trait are obtained with more than one method.
In this study these criteria are compared with two other procedures (an
ANOVA model and confirmatory factor analysis) for analyzing
multitrait-multimethod (MTMM) data. Despite important limitations of the
Campbell-Fiske criteria, the usefulness of interpretations based upon the
criteria, the heuristic value of their application, and the popularity of
the method all dictate that it continue to be used as a preliminary
inspection of MTMM matrices. The principal advantage of the ANOVA model is
a convenient summary and test of convergent, divergent and method/halo
effects. However, the limitations of this approach are even more numerous
than those encountered with the Campbell-Fiske criteria, and so the ANOVA
approach should only be used to supplement other procedures. Confirmatory
factor analysis provides a direct test of the statistical significance and
importance of various trait and method factors. The size of the factor
loadings provides a convenient description of the magnitude of method and
trait effects. By constraining various parameters the researcher may
formulate and test alternative configurations of method and trait factors.
Consequently, confirmatory factor analysis offers the advantages of both
the other approaches without many of their limitations, and is the
recommended procedure for analyzing MTMM data.
Confirmatory Factor Analysis and ANOVA Analyses
of Multitrait-Multimethod Matrices

Campbell and Fiske (1959) have advocated the assessment of validity by
obtaining measures of more than one trait, each of which is assessed by more
than one method. In the present example the different traits are nine
dimensions of evaluations of instructional effectiveness; the different
methods of assessing the traits are student ratings of teaching
effectiveness and instructor ratings of their own teaching effectiveness.
Convergent validity, that which is most typically determined, is the
agreement between measures of the same trait assessed by two different
methods (here, student-faculty agreement on evaluations of teaching).
Discriminant validity refers to the distinctiveness of each of the
trait-factors.

Determination of convergent and discriminant validity is based upon
inspection or analysis of a multitrait-multimethod matrix such as the one
shown in Table 1 (considering only the coefficients below the main diagonal
of the entire 18 x 18 matrix at this point). Correlations between different
traits assessed by the same method appear in the monomethod-heterotrait (the
upper left and lower right) blocks of the matrix. Correlations between
different traits assessed by different methods are in the
heteromethod-heterotrait (lower left) blocks of the matrix. The convergent
validity coefficients, correlations between the same traits assessed by
different methods, appear in the heteromethod-monotrait diagonal of this
matrix (the values in <> in Table 1). It is also valuable to have the
reliabilities of each measure in the diagonals of the
heterotrait-monomethod matrices (the values in parentheses in Table 1).
Campbell and Fiske (1959) proposed four criteria for assessing convergent
and divergent validity:
1) The convergent validity coefficients should be statistically
significant and sufficiently different from zero to warrant
further examination of validity. Failure of this test indicates
that the different methods are measuring different constructs
and implies a lack of validity in at least one of the methods.

2) The convergent validities should be higher than the correlations
between different traits assessed by different methods. The
failure of this test implies that agreement on a particular trait
is not independent of agreement on other traits, perhaps suggesting
that the agreement can be explained in terms of a generalized
agreement that encompasses more than one (or all) of the traits.

3) The convergent validities should be higher than correlations
between different traits assessed by the same method. If the
convergent validities are not substantially higher, there is
the suggestion that the traits may be correlated, that there
is a method effect, or some combination of both these possibilities.
If the correlations between different traits assessed by the same
method approach the reliabilities of the traits, then there is
evidence of a strong halo or method bias.

4) The pattern of correlations between different traits should be
similar for each of the different methods. Satisfaction of this
criterion, assuming that there are significant correlations among
traits, would suggest that the underlying traits are truly
correlated. Failure to meet this criterion implies that the observed
correlation between traits assessed by a given method is due to
a method or halo bias.
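
The second and third criteria amount to counting how often a convergent validity fails to exceed a competing correlation. The following is a minimal sketch of that count for the two-method case, assuming the matrix is ordered with all traits for the first method followed by all traits for the second. The function name `campbell_fiske_counts`, the variable names, and the use of a simple "greater than or equal" rule for a rejection are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def campbell_fiske_counts(R, n_traits):
    """Count rejections of criteria 2 and 3 in a two-method MTMM matrix.

    R is a symmetric (2T x 2T) correlation matrix whose first T rows are the
    traits measured by method 1 and whose last T rows are the same traits
    measured by method 2 (an assumed ordering).
    """
    T = n_traits
    R = np.asarray(R, dtype=float)
    hetero = R[T:, :T]              # heteromethod block
    validities = np.diag(hetero)    # convergent validity diagonal
    fail2 = 0
    fail3 = {0: 0, 1: 0}            # rejections of criterion 3, per method
    for t in range(T):
        v = validities[t]
        others = [u for u in range(T) if u != t]
        # Criterion 2: compare with the same row and column of the
        # heterotrait-heteromethod block.
        fail2 += sum(hetero[t, u] >= v for u in others)
        fail2 += sum(hetero[u, t] >= v for u in others)
        # Criterion 3: compare with correlations between this trait and
        # every other trait measured by the same method.
        for m in (0, 1):
            block = R[m * T:(m + 1) * T, m * T:(m + 1) * T]
            fail3[m] += sum(block[t, u] >= v for u in others)
    return validities, int(fail2), {m: int(c) for m, c in fail3.items()}

# Hypothetical 2-trait, 2-method example (values invented for illustration).
R_demo = np.array([
    [1.0, 0.3, 0.6, 0.2],
    [0.3, 1.0, 0.1, 0.5],
    [0.6, 0.1, 1.0, 0.7],
    [0.2, 0.5, 0.7, 1.0],
])
validities_demo, fail2_demo, fail3_demo = campbell_fiske_counts(R_demo, 2)
```

In this invented example the second method shows rejections of criterion 3 while the first does not, which is the kind of asymmetry the criteria would read as a possible method or halo effect.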



Despite the intuitive appeal of the Campbell-Fiske criteria, there
are numerous potential problems in their application. Although many of
these were anticipated by Campbell and Fiske, solutions were not offered.
Perhaps recognizing the dangers in the precise formulation of their
criteria, these authors stated that the development of statistical
treatments might be unnecessary or inappropriate.

An obvious problem with the Campbell-Fiske criteria is the lack of
specification as to what constitutes satisfactory results. The application
to be presented in this paper, for example, involves nine traits,
each assessed by two methods. Testing the second and third criteria
alone requires that each of the nine convergent validities be compared
with 32 different correlations, a total of 288 comparisons. Besides being
unwieldy, the likelihood of obtaining rejections due to sampling
fluctuations alone increases geometrically with the number of traits and
methods. The user is left with the task of determining either the
proportion of failures or some average difference between the convergent
validities and the coefficients against which they are to be compared. In
either case, the decision as to what constitutes a failure is arbitrary.
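
The comparison counts quoted above follow from simple arithmetic. As a rough sketch for the two-method case only, each convergent validity is compared with 2(T - 1) coefficients under the second criterion and with (T - 1) coefficients in each of the two methods under the third. The function name and dictionary keys below are hypothetical.

```python
def comparisons_two_methods(n_traits):
    """Number of Campbell-Fiske comparisons for T traits and two methods."""
    # Criterion 2: same row and same column of the heteromethod block.
    per_validity_c2 = 2 * (n_traits - 1)
    # Criterion 3: (T - 1) other traits within each of the two methods.
    per_validity_c3 = 2 * (n_traits - 1)
    return {
        "criterion2_total": n_traits * per_validity_c2,
        "criterion3_per_method": n_traits * (n_traits - 1),
        "per_validity": per_validity_c2 + per_validity_c3,
        "total": n_traits * (per_validity_c2 + per_validity_c3),
    }

counts_nine_traits = comparisons_two_methods(9)
```

With nine traits this gives 32 comparisons per convergent validity and 288 in all, so the burden grows quickly as traits are added.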

An even more serious ambiguity exists in the criteria used to assess
discriminant validity. At least conceptually, Campbell and Fiske make
clear distinctions between method variance, trait variance, and trait
covariation. Method variance, the introduction of systematic variation
due to a specific method of data collection, is clearly detrimental to
discriminant validity, though it does not preclude the demonstration of
either divergent or convergent validity. True trait variance (i.e.,
convergent validity), the correlation between different methods of
assessing the same trait that is independent of method variance, is
obviously good, but it does not imply discriminant validity. True trait
covariation, the true correlation between different traits that does not
depend upon the method of data collection, will increase the likelihood of
failures in the application of the second and third criteria. However, the
fourth criterion specifically tests for true trait covariation, and its
demonstration is taken as support for discriminant validity. A complete
lack of trait covariation makes interpretation simpler, but is unlikely to
exist in any but the most contrived of situations (e.g., attitudes toward
cigarette smoking and capital punishment). Trait correlations approaching
unity can be unambiguously interpreted as a complete lack of discriminant
validity. For most applications, however, some low to moderate true
trait covariation is likely, and its interpretation is left ambiguous.

The most serious problem with the Campbell-Fiske criteria is that
they are based upon inspection of correlations between observed variables,
but make inferences about underlying trait and method factors. The
validity of any set of interpretations depends upon the behavior of the
underlying constructs. This can be illustrated with the problem of
systematically differing reliabilities. Application of the criteria
implicitly assumes, as recognized by Campbell and Fiske, that each of the
measures is equally reliable. If there are substantial differences in
the reliabilities of different traits, or in the measures obtained with
different methods, then failures of one or more of the criteria may be a
function of the differential reliabilities alone. For example, if traits
assessed by one method are systematically more reliable than those assessed
by a second method, then the correlations among traits assessed with the
more reliable method will be higher, and give the appearance of a method
effect. Some authors have suggested that the multitrait-multimethod
matrix be corrected for attenuation (Heberlein, 1969; Althauser & Heberlein,
1970).
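
The correction for attenuation mentioned here is usually taken to be the classical formula in which an observed correlation is divided by the square root of the product of the two reliabilities. A minimal sketch, assuming that formula and assuming the reliabilities are supplied in the same order as the rows of the matrix (the function name and the demonstration values are illustrative):

```python
import numpy as np

def correct_for_attenuation(R, reliabilities):
    """Divide each correlation r_xy by sqrt(r_xx * r_yy)."""
    R = np.asarray(R, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)
    denom = np.sqrt(np.outer(rel, rel))
    corrected = R / denom
    # The diagonal is not a correlation between two measures; reset it.
    np.fill_diagonal(corrected, 1.0)
    return corrected

# Invented values: an observed r of .40 between measures with
# reliabilities .64 and .81.
R_obs = np.array([[1.0, 0.4], [0.4, 1.0]])
R_corr = correct_for_attenuation(R_obs, [0.64, 0.81])
```

Because the divisor is smaller for less reliable measures, correlations involving the less reliable method are raised the most, which is why the correction can change the apparent size of a method effect.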

Similarly, the Campbell-Fiske criteria also assume that convergent
validities reflect the effect of shared trait variance. While this is true,
the convergent validity coefficients can also be affected by shared method
variance or a trait-method interaction. Furthermore, the existence of
shared method variance or trait-method interactions may act to either
artificially increase or decrease the observed validity coefficient. A
more detailed discussion of the implications of these underlying inferences
is presented by Alwin (1974).



Since the development of the Campbell-Fiske criteria for assessing
the multitrait-multimethod matrix, a variety of specific statistical tests
have been developed (Althauser & Heberlein, 1970; Alwin, 1974; Joreskog,
1974; Kavanagh, MacKinney & Wolins, 1971; Kenny, 1979; Lomax & Algina,
1979; Schmitt, 1978; Schmitt, Coyle & Saari, 1977; Werts & Linn, 1970).
In the present study two of these procedures are applied, and their
limitations are illustrated. The first is an analysis of variance technique
that was presented by Kavanagh et al. (1971), while the second is a
variety of confirmatory factor analysis models as elaborated by Schmitt
(1978).

In the present study, the multitrait-multimethod approach was used to
validate students' evaluations of teaching effectiveness. Instructors in
329 college classrooms were asked to evaluate their own teaching
effectiveness on the same nine-trait instrument as their students. Previous
application (Marsh, in press; Marsh & Overall, 1979; Marsh, Overall & Kesler,
1979) of the Campbell-Fiske criteria left several questions unanswered.
In spite of evidence for both convergent and divergent validity, there was
the suggestion of a moderate method variance, particularly with the student
ratings. However, confounding this suggestion were the facts that: 1) the
student ratings were more reliable than the instructor ratings (perhaps
explaining the higher correlations among the student ratings), and 2) the
likelihood that the correlations among the traits (instructional evaluation
factors) were true correlations rather than method or halo bias. The
purpose of this study is to compare the conclusions based upon the
Campbell-Fiske criteria with those obtained from two alternative analytic
procedures, and to discuss advantages and disadvantages of the approaches.

Method 



During the academic year 1977-78 student evaluations were collected
in virtually all courses offered in the Division of Social Sciences at the
University of Southern California. Evaluations were administered shortly
before the end of the term, generally by a designated student in the class
or by a staff person. The surveys were completed by an average of 76%
(a range of from 54% to 100%) of the students enrolled in each class.

Instructor self evaluation surveys were sent to all teachers who had
been evaluated by students in at least two different courses during the
same term. Instructors were asked to evaluate the effectiveness of their
own teaching in both courses. These surveys were completed after the end
of the term, but before summaries of the student evaluations were returned.
While participation was voluntary, a cover letter from the Dean of the
Division strongly encouraged cooperation and guaranteed the confidentiality
of each teacher's response. Instructors evaluated both courses with a set of
items identical to those used by students, except that items were worded in
the first person. They were specifically instructed to rate their own
teaching effectiveness and not to report how students would rate them. A
total of 181 instructors (78%) returned self evaluations from 331 courses:
ratings of 183 undergraduate courses taught by faculty, 45 graduate level
courses, and 103 courses taught by teaching assistants.

The evaluation instrument consisted of 35 items that were designed
to measure 9 traits. Previous research, based upon a different sample of
511 undergraduate classes taught by regular faculty, determined the
reliability of the evaluation factors (median alpha = .94), confirmed the
existence of the nine evaluation dimensions, and provided weights that were
used in calculating factor scores (see Marsh, in press; Marsh & Overall,
1979). The evaluation factor scores used in the present study were weighted
averages of standardized responses to each item, the weights having been
derived from the previous factor analysis. The evaluation trait-factors and
a brief description are as follows:

LEARNING/VALUE: The extent to which students felt they encountered a
valuable learning experience that was intellectually challenging.

INSTRUCTOR ENTHUSIASM: The extent to which students perceived the
instructor to display enthusiasm, energy, humor and an ability to
hold interest.

ORGANIZATION: The instructor's organization of the course, course
materials, and class presentations.

GROUP INTERACTION: Students' perceptions of the degree to which the
instructor encouraged class discussions and invited students to
share their own ideas or to be critical of those presented by the
instructor.

INDIVIDUAL RAPPORT: The extent to which students perceived the instructor
to be friendly, interested in students, and accessible in or out of
class.

BREADTH OF COVERAGE: The extent to which students perceived the
instructor to present alternative approaches to the subject and
to emphasize analytic ability and conceptual understanding.

EXAMINATIONS: Students' perceptions of the value and fairness of
graded materials in the course.

ASSIGNMENTS: The value of class assignments (readings, homework, etc.)
in adding appreciation and understanding of the subject.

WORKLOAD/DIFFICULTY: Students' perceptions of the relative difficulty,
workload, pace of presentations, and the number of hours required
by the course.

Separate factor analyses were performed on the student and instructor
self evaluations for the 329* classes included in this study (Marsh, in
press; Marsh & Overall, 1979). This analysis was performed to determine
if similar evaluation trait-factors underlie both the student and instructor
self evaluations, and if these were similar to results previously obtained
for a different sample of student ratings. Factor analyses of both student
and instructor ratings confirmed the existence of the same nine
trait-factors that had been previously identified. Each item, for both
student and instructor evaluations, loaded highest on the factor it was
designed to measure. Loadings for items defining each factor generally
exceeded .50, and all other loadings were typically less than .20.
Furthermore, the factor loadings from both these analyses were quite similar
to those previously obtained with a different population of student
evaluations. The 315 factor loadings (35 items loading on each of 9 factors)
for the factor analysis of instructor ratings considered in this study
correlated .90 with both the 315 factor loadings obtained for student
evaluations in this study and those obtained with a previous factor analysis
of a different sample of student evaluations; the two sets of 315 loadings
from
the two factor analyses of the student ratings correlated .9$ with each 
other. These findings justify the assumption that similar evaluation
trait-factors underlie both the student and instructor evaluations.

Results 

Campbell-Fiske Criteria

Application of the Campbell-Fiske criteria discussed earlier requires
a visual inspection of the multitrait-multimethod matrix presented in
Table 1. One of the limitations of the use of these criteria, as indicated
by Campbell & Fiske (1959), is the implicit assumption that the trait
reliabilities obtained with different methods are comparable. This is
clearly not the case in the present example, since student evaluations
(based upon class average responses) are consistently more reliable.
Coefficient alphas (see Table 1) for student ratings vary from .87 to
.98 (median .94), while those for the instructor self evaluations vary
from .70 to .90 (median .82). Consequently, for each of the correlations
presented in Table 1, the same correlation corrected for attenuation
is also presented. Interpretation of the Campbell-Fiske criteria is
discussed in terms of both corrected and uncorrected correlations.

The first Campbell-Fiske criterion requires that convergent validity
coefficients be statistically significant and high enough to warrant
further consideration of validity. Each of the convergent validity
coefficients presented in Table 1 is statistically significant, and they
are substantial (median r « .^+5^ corrected for attenuation). 



Insert Table 1 About Here



The second Campbell-Fiske criterion requires that each convergent
validity coefficient be higher than any other correlation in the same
row or column of the same heterotrait-heteromethod block. This test
requires that each of the nine convergent validity coefficients be
compared to each of 16 other coefficients, a total of 144 comparisons in
all. Data presented in Table 1 satisfy this criterion for 142 of the 144
comparisons (for both corrected and uncorrected correlations), providing
good support for this aspect of discriminant validity.

The third criterion requires that each convergent validity be higher
than correlations between that trait and any other trait assessed by the
same method. Application of this criterion to the uncorrected data
indicates only \ rejections (out of 72 comparisons) for the instructor 
self evaluations. For the student evaluations, however, there are 30
rejections (also out of 72 comparisons). On the surface, this would seem
to suggest a method or halo effect for the student ratings, though little
for the instructor self evaluations. However, this interpretation is
biased by the fact that the student ratings are consistently more reliable
than the instructor ratings. Correlations involving only student ratings
are least attenuated, while those involving only instructor ratings are
most attenuated. Consequently, relative to the convergent validities,



only 5 with the instructor self evaluations. The correction for
attenuation decreased the apparent method effect and lessened the
difference in method effect between student and instructor ratings, but
these changes were small.

The fourth criterion requires that the pattern of correlations among
different traits should be similar for the different methods. A visual
inspection of Table 1 suggests that this may be the case. To provide a
more precise test, the 36 off-diagonal coefficients in the student rating
block were correlated with those in the instructor rating block. The
result, r = .43, was significant at the .01 level and suggests that there
is a similarity in the pattern of correlations. This suggests that
there is true trait covariation that is independent of method.
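
The test of pattern similarity described above can be sketched by pulling the off-diagonal coefficients out of each monomethod block and correlating the two vectors. The sketch below assumes a two-method matrix ordered with one method's traits first and the other method's traits second; the function name, the ordering, and the demonstration values are assumptions made for illustration, and the significance test is not reproduced.

```python
import numpy as np

def pattern_similarity(R, n_traits):
    """Correlate the heterotrait coefficients of the two monomethod blocks."""
    T = n_traits
    R = np.asarray(R, dtype=float)
    lower = np.tril_indices(T, k=-1)      # T(T-1)/2 off-diagonal positions
    block_first = R[:T, :T][lower]        # first method's trait correlations
    block_second = R[T:, T:][lower]       # second method's trait correlations
    r = np.corrcoef(block_first, block_second)[0, 1]
    return r, block_first.size

# Invented 3-trait example in which the second block is the first block
# shifted by a constant, so the patterns agree perfectly.
L = np.zeros((6, 6))
L[1, 0], L[2, 0], L[2, 1] = 0.2, 0.3, 0.5
L[4, 3], L[5, 3], L[5, 4] = 0.4, 0.5, 0.7
R_pattern = L + L.T + np.eye(6)
r_pattern, n_pairs = pattern_similarity(R_pattern, 3)
```

With nine traits, each block contributes 9 x 8 / 2 = 36 coefficients, matching the number used in the text.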

In summary, the data provide clear support for convergent validity,
and at least two of the criteria of discriminant validity.
Student-instructor agreement on any one trait was independent of their
agreement on other traits. Furthermore, there was a similarity in the
pattern of trait correlations for student and instructor ratings. There was
an indication, however, of some halo or method effect, particularly with
the student ratings.




The ANOVA Approach

Based upon recent citations in the literature, this technique appears
to have been popularized by Kavanagh, MacKinney, and Wolins (1971). Stanley
(1961) demonstrated how multitrait-multimethod data could be analyzed with
a three-factor unreplicated analysis of variance; when repeated measurements
of cases (ratings of college classes in the present application) are
measured over all levels of two other variables (traits and methods in this
case), three orthogonal sources of variation can be estimated. The main
effect due to classes is a test of how well ratings in general discriminate
between classes, and is suggested to be analogous to convergent validity.
It should be noted that this is NOT the same use of convergent validity as
that discussed by Campbell and Fiske (1959). The interaction between classes
and traits tests whether the differentiation between classes depends upon
traits. If it does not, then the traits have no differential validity
(i.e., each class is ranked the same regardless of the trait). This is
taken to be a measure of discriminant validity. The interaction between
classes and methods tests whether the differentiation between classes
depends upon methods. If it does, then the different methods introduce a
source of systematic (undesirable) variance. This is taken to be a measure
of method or halo effect. The class by trait by method interaction
is assumed to measure only random error (i.e., the differentiation between
classes is assumed not to depend upon any specific trait-method
combination). Stanley (1961) recommends that the measures be replicated
for each subject within a given study, thus providing independent
estimates of the three way interaction and the error term (also see King,
Hunter & Schmidt, 1980). However, his recommendation does not seem to have
ever been followed. In this model main effects due to traits and methods can
also be calculated, but these are generally of less interest.

Boruch, Larkin, Wolins and MacKinney (1970) and Kavanagh, MacKinney
and Wolins (1971) have described computational procedures whereby the mean
squares and the variance component estimates for the analysis of variance
model could be computed directly from the correlations contained in the
multitrait-multimethod matrix. The computational equations for computing
these effects are presented in Table 2. The systematic differences in the
reliabilities of student and instructor ratings, as previously discussed,
will produce biased estimates of the discriminant validity and method/halo
effects (Boruch, Larkin, Wolins & MacKinney, 1970; Schmitt, et al., 1977).
Consequently, the ANOVA procedure was also applied to the correlations
that were corrected for unreliability (see Table 1).

Each of the ANOVA effects (Convergent Validity, Divergent Validity,
and Method/Halo bias) and their variance components are presented in
Table 2. All three effects are statistically significant for analyses
based upon both the corrected and uncorrected correlation coefficients.
The size of the discriminant validity effect (the variance component)
was approximately twice that of the method/halo effect. When the
correlation coefficients were corrected for attenuation, each of the
effects, except the error term, increased. However, the largest increase
occurred for the discriminant validity effect. As was observed with the
Campbell-Fiske analysis, the correction for attenuation improved the
discriminant validity, but did not eliminate the method/halo bias.

Insert Table 2 About Here 
The principal advantages of the ANOVA model are its ease of
application and the convenient descriptive statistics summarizing the
relative magnitude of the effects of convergent validity, divergent
validity, and method/halo bias. However, the model also has major
shortcomings. The problem of differing reliabilities, which this approach
shares with the Campbell-Fiske analysis, has already been discussed. The
assumption that the class by method by trait interaction contains only
error variance is not normally testable, and its violations may have
varying influences on the estimation of the other effects. The model makes
no provision for the possibility of true trait covariation or correlated
method effects, and provides no test for their existence. Finally, many of
the heuristic inferences that are likely to result from the application of
the Campbell-Fiske criteria will be lost with application of only the ANOVA
analysis. Many of the disadvantages of the ANOVA model are shared with the
Campbell-Fiske analysis, but the misleading precision and simplicity of the
ANOVA approach are less likely to reveal these potential problems.

There is no clear equivalence between the effects estimated by the
ANOVA model and the Campbell-Fiske criteria. Inspection of the
computational equation for the convergent validity effect (see Table 2)
indicates that it is a function of the average correlation in the entire
multitrait-multimethod matrix. This is clearly different from the
Campbell-Fiske criterion that is based upon just the convergent validity
diagonal. In particular, even if all the convergent validity coefficients
approached unity, the average correlation in the entire matrix generally
would not. Similarly, the ANOVA model might indicate a moderate degree of
convergent validity even if the average convergent validity coefficient
were close to zero.

The similarity of the divergent and method/halo effects in the ANOVA
model and the Campbell-Fiske criteria is harder to assess. Inspection of
the computational equations for the ANOVA effects (Table 2) indicates that
the discriminant validity and method/halo effects are a function of the
difference between the average of specified correlations and the average
correlation in the entire MTMM matrix. The comparisons in the
Campbell-Fiske criteria are more specific. Furthermore, the proportion of
variance accounted for by the four effects in the ANOVA model (the
convergent, divergent, method/halo, and error effects) must sum to 1.0.
This means that an increase in the convergent effect will cause a decrease
in the divergent effect so long as the method/halo and error effects remain
constant. This is quite different from the Campbell-Fiske approach where
an increase in convergent validity will lead to an increase in discriminant
validity. Similarly, when correlations in the present application were
corrected for attenuation, the Campbell-Fiske analysis indicated that the
method effect was reduced (i.e., fewer rejections of criterion 3), but
the method effect in the ANOVA analysis actually increased, though
the increase was less than the increase in the divergent validity effect.
The ANOVA model has no term that is comparable to the fourth
Campbell-Fiske criterion. In fact the ANOVA model is based upon the
assumption that traits are uncorrelated (see King, et al., 1980) but
provides no test of this assumption. These observations indicate that
comparisons between the ANOVA and Campbell-Fiske analyses should be made
cautiously.
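
The contrast between an average taken over the whole matrix and an average taken over the validity diagonal can be made concrete with a small sketch. The code below does not reproduce the computational equations of Table 2, which are not shown here; it only computes the averages that the discussion refers to, for a two-method matrix with an assumed ordering (first method's traits, then second method's traits). All names and values are illustrative.

```python
import numpy as np

def mtmm_averages(R, n_traits):
    """Average correlations over several regions of a two-method MTMM matrix."""
    T = n_traits
    R = np.asarray(R, dtype=float)
    lower_all = np.tril_indices(R.shape[0], k=-1)
    lower_block = np.tril_indices(T, k=-1)
    same_method = np.concatenate(
        [R[:T, :T][lower_block], R[T:, T:][lower_block]]
    )
    return {
        "overall": R[lower_all].mean(),                    # entire matrix
        "validity_diagonal": np.diag(R[T:, :T]).mean(),    # same trait, different method
        "same_method": same_method.mean(),                 # different traits, same method
    }

# Invented 3-trait example: validities of .90 and every other correlation zero.
R_avg = np.eye(6)
for t in range(3):
    R_avg[3 + t, t] = R_avg[t, 3 + t] = 0.9
averages_demo = mtmm_averages(R_avg, 3)
```

Here the validity diagonal averages .90 while the overall average is only .18, illustrating why an effect tied to the overall average need not track the Campbell-Fiske convergent validities.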

In summary, application of the ANOVA model indicates significant
effects of convergent, divergent and method/halo effects. The size of
the discriminant validity effect (the variance component) was more than
twice the size of either of the other two effects. The variance component
for this effect was also increased the most by the correction for
attenuation.

Confirmatory Factor Analysis

The confirmatory factor analysis approach is described under a variety
of different labels in the literature: restricted factor analysis (Boruch
& Wolins, 1970), confirmatory factor analysis (Werts, Joreskog & Linn, 1972),
path analysis (Schmitt, Coyle & Saari, 1977; Schmitt, 1978), and exploratory
factor analysis (Lomax & Algina, 1979). This plethora of labels, and
particularly the emphasis on path analysis (and structural equations), is
unfortunate. The analysis of the MTMM matrix can be viewed as a
straightforward application of confirmatory factor analysis with a priori
factors corresponding to specific traits and methods, and the major
findings can be interpreted in much the same way as can any other factor
analysis.

The Confirmatory Factor Analysis Model. In this study the notation,
the specification of the model, and the actual analysis are performed with
the commercially available LISREL IV program (Joreskog & Sorbom, 1978).
This program embodies Joreskog's maximum-likelihood approach to confirmatory
factor analysis. The model used in this analysis requires the specification
of three different matrices.¹ These are the LAMBDA matrix that contains
the factor loadings, the PSI matrix that contains the correlations between
the factors, and the THETA matrix that contains the error/uniqueness of
each measured variable. These are conceptually similar to the rotated
factor matrix, the matrix of correlations between factors, and the
communalities (actually one minus the communalities) that result from common
factor analysis. In confirmatory factor analysis, however, the researcher
is able to constrain various parameters in the different matrices in
order to test alternative models. On the basis of these three matrices,
a reproduced correlation matrix is determined that provides a "best fit"
to the original correlation matrix within the constraints that are imposed
by the proposed model. Using matrix notation, SIGMA, the reproduced
correlation matrix, is defined as:

SIGMA = [LAMBDA * PSI * LAMBDA'] + THETA EPSILON                    (1)

In the present example, the configuration for the factor loading (LAMBDA)
matrix and the matrix of correlations between factors (the PSI matrix) is
presented in Table 3.
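
To make the matrix equation concrete, the following Python sketch computes a reproduced correlation matrix for a deliberately tiny case: two traits measured by two methods (four variables, four factors). Every number and helper name here is a hypothetical illustration, not an estimate from this study. THETA is assumed to be diagonal and is set to one minus each variable's communality, so that the diagonal of SIGMA equals 1.0.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def reproduced_correlations(lam, psi):
    """Return (SIGMA, THETA) with SIGMA = LAMBDA * PSI * LAMBDA' + THETA.

    THETA is diagonal and equals one minus each variable's communality
    (an assumption of this sketch), so SIGMA has a unit diagonal.
    """
    common = matmul(matmul(lam, psi), transpose(lam))
    theta = [1.0 - common[i][i] for i in range(len(lam))]
    sigma = [row[:] for row in common]
    for i, uniqueness in enumerate(theta):
        sigma[i][i] += uniqueness
    return sigma, theta


# Hypothetical loadings. Rows: method A trait 1, method A trait 2,
# method B trait 1, method B trait 2. Columns: method A factor,
# method B factor, trait 1 factor, trait 2 factor.
LAMBDA = [
    [0.3, 0.0, 0.7, 0.0],
    [0.3, 0.0, 0.0, 0.6],
    [0.0, 0.5, 0.6, 0.0],
    [0.0, 0.5, 0.0, 0.7],
]
# Hypothetical factor correlations; trait-method correlations fixed at zero.
PSI = [
    [1.0, 0.2, 0.0, 0.0],
    [0.2, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.4],
    [0.0, 0.0, 0.4, 1.0],
]

SIGMA, THETA = reproduced_correlations(LAMBDA, PSI)
```

In this toy example the reproduced correlation between the two measures of trait 1 is SIGMA[0][2], which combines a trait part (0.7 * 0.6) and a small method part (0.3 * 0.2 * 0.5).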



Insert Table 3 About Here



In the LAMBDA matrix, each of the 11 factors (Eta 1 - Eta 11) represents
either a Method factor (Eta 1 & Eta 2), or a Trait factor (Eta 3 -
Eta 11). The first method factor is defined by the nine instructor self
evaluations (Ilrn, Ient, ..., Iwrk), while the second method factor is
defined by the nine student ratings (Slrn, Sent, ..., Swrk). Each of the
nine trait factors is defined by the one instructor and one student rating
of the same trait. For example, the first trait factor (Eta 3) is the
Learning trait factor and is defined by the instructor and student ratings
of Learning. Each of the "0" elements in the matrix represents a fixed
parameter, while the other 36 elements are free and will be estimated.
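
The pattern of fixed and free loadings described above can be generated mechanically. The sketch below is only an illustration in Python, not LISREL input; the function name and the True/False encoding are assumptions made here, while the variable labels follow the abbreviations used in Table 3.

```python
TRAITS = ["lrn", "ent", "org", "grp", "ind", "brd", "exm", "asg", "wrk"]
METHODS = ["I", "S"]  # instructor self evaluations, student ratings


def lambda_pattern(traits, methods):
    """Return (row_labels, pattern) for the LAMBDA matrix.

    pattern[i][j] is True where the loading of variable i on factor j is
    free, and False where it is fixed at zero. Method factors come first
    (Eta 1, Eta 2), followed by one factor per trait (Eta 3 - Eta 11).
    """
    n_factors = len(methods) + len(traits)
    labels, pattern = [], []
    for m_idx, method in enumerate(methods):
        for t_idx, trait in enumerate(traits):
            row = [False] * n_factors
            row[m_idx] = True                  # loading on its method factor
            row[len(methods) + t_idx] = True   # loading on its trait factor
            labels.append(method + trait)
            pattern.append(row)
    return labels, pattern


labels, pattern = lambda_pattern(TRAITS, METHODS)
n_free = sum(sum(row) for row in pattern)  # number of free loadings
```

With nine traits and two methods this gives 18 rows, 11 columns, and 36 free loadings, matching the count given in the text.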

In most of the models to be discussed, with some notable exceptions,
the factors are oblique (correlated). The correlations among the 11 factors
appear in the PSI matrix (see Table 3). Each of the elements in the PSI
matrix represents a correlation between two factors; for example, r10.11
represents the correlation between the Assignment factor (Eta 10) and the
Workload/Difficulty factor (Eta 11). Elements of the matrix
that begin with r were free and estimated by the program; the "0"
elements were fixed to be zero; and the diagonals were fixed to be 1.0.

The LISREL program attempts to minimize a maximum-likelihood loss
function that is based upon differences between the original and
reproduced correlation matrices, and provides an overall chi-square test of
the goodness-of-fit of the proposed model. As described by Joreskog (Joreskog
& Sorbom, 1978), it also determines a test of identification,
asymptotically efficient estimates of each free parameter in the
proposed model under the assumptions of multivariate normality, estimates of
the standard error of each fitted parameter (allowing a statistical test
of its difference from zero), and additional information that is helpful
in determining what changes in the proposed model would provide a better
fit to the data (see Maruyama & McGarvey, 1980, for further discussion).

The minimum condition for fitting the complete model (Alwin, 1974;
Werts, et al., 1972) is that there be at least three traits and three
methods. Because the present data include only two methods, this means
that, without making any further assumptions (i.e., constraining more
parameters to a fixed value), the most unrestricted form of the model is
not identified and cannot be tested. On the basis of both
substantive (Boruch & Wolins, 1970) and practical (Althauser & Heberlein,
1970) considerations, the correlations between traits and methods were
set to zero. However, the model was still not identified.² In order
to obtain a testable model, the reliabilities of the student and instructor
ratings (coefficient alphas based upon the items that define each of the
factors) were computed and used as a basis for determining the values of
THETA (error/uniqueness components). Preliminary analysis indicated that
this resulted in a very poor fit to the data, suggesting that each factor
has a unique component as well as error. Consequently, the 18
variables were entered into a standard factor analysis procedure (Nie, et
al., 1975) and an 11 factor solution was determined. The communalities
resulting from this analysis (see Nie, et al., 1975, pp. 4?5-477) were
then used to determine an estimate of the THETA elements. This procedure,
which provides an estimate of the combined uniqueness and unreliability,
provided a much better fit to the data. Consequently, in order to
circumvent the identification problem, all the THETA elements were set at a
value of 1 minus the communality of the variable. This same set of values
was used for each of the models to be discussed. Consequently, the most
general model to be considered in this study is one in which correlations
between methods and traits are fixed to be zero, and the values of THETA
(error/uniqueness components) are predetermined.

The Goodness of Fit of the Model. The LISREL program provides a
chi-square test of the overall goodness-of-fit, but the test is dependent
upon the sample size. A reasonably good fit to the data will produce a
statistically significant chi-square value if the sample size is large,
while a poor fit based upon a small sample size may not result in a
statistically significant chi-square value. Alternative indices of fit
(Schmitt, 1978) include the ratio of the chi-square to the degrees of
freedom, the average difference between the reproduced and original
correlation matrix, and a reliability coefficient developed by Tucker and
Lewis (1973). The reliability coefficient is defined as:

r = (Co - Cm) / (Co - 1)

where: Co = the chi-square/df ratio for a null model,
       Cm = the chi-square/df ratio for the tested model,
       1  = the expected value of the chi-square/df ratio.

This coefficient scales the chi-square goodness-of-fit value along a
scale that varies from zero (the null model) to 1.0, though values greater
than 1.0 are possible. The null model generally consists of specifying
SIGMA to be a diagonal matrix, testing the assumption that the measured
variables are uncorrelated. Tucker and Lewis suggest a value of .90 or
higher provides an adequate fit to the data. Their coefficient provides
an index of the proportion of the variance that is explained by the model
rather than a statistical test of its goodness-of-fit. For example, a
model that is tested with a small number of cases (e.g., fewer than 50 cases)
may result in statistically insignificant differences from the observed
data (based upon the chi-square test) and yet only have a Tucker-Lewis
reliability coefficient of .50. This suggests that while the proposed
model fits the data in a statistical significance sense, the test was a
very weak one and there may be many possible models that would do as well.
Alternatively, a model that is tested with a large number of cases may
have a Tucker-Lewis reliability of .99 and still have a significant
chi-square value (see Bentler & Bonett, 1980, for further discussion).
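
The reliability coefficient defined above is simple to compute once the chi-square values and degrees of freedom of the null and tested models are known. The Python sketch below uses a hypothetical function name and made-up chi-square values chosen only to show the arithmetic; they are not results from this study.

```python
def tucker_lewis(chi2_null, df_null, chi2_model, df_model):
    """Tucker-Lewis reliability coefficient: (Co - Cm) / (Co - 1).

    Co and Cm are the chi-square/df ratios of the null and tested models.
    """
    c_null = chi2_null / df_null
    c_model = chi2_model / df_model
    return (c_null - c_model) / (c_null - 1.0)


# Hypothetical values: null model ratio 20.0, tested model ratio 2.4.
r_typical = tucker_lewis(3000.0, 150, 240.0, 100)

# The null model tested against itself scores zero.
r_null = tucker_lewis(3000.0, 150, 3000.0, 150)

# A tested model whose ratio falls below 1.0 scores above 1.0.
r_overfit = tucker_lewis(3000.0, 150, 80.0, 100)
```

The three calls illustrate the range described in the text: zero for the null model, a value between zero and one for a typical model, and a value above 1.0 when the chi-square/df ratio of the tested model drops below its expected value.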

The estimated parameters for the general model (Model I) described
in this section are presented in Table 4. The chi-square value for this
model is statistically significant, but the chi-square/df ratio is only
2.38 and the Tucker-Lewis reliability coefficient is .98. This indicates
a good fit to the data.



Insert Table 4 About Here



Inspection of the values suggests that each of the nine trait factors is
well defined, that there is substantial method variance associated with
the student ratings and some associated with instructor self-evaluations,
and that the traits are moderately correlated.

Testing Alternative Models. Comparisons of two nested models can be
made by taking the difference in their two chi-square values and testing
this against the difference in the degrees of freedom (Bentler & Bonett,
1980; Kenny, 1976; Schmitt, et al., 1977). For example, one of the
alternative formulations of Model I postulated that the 36 correlations between
the nine trait factors (in the PSI matrix) are really zero (Model 7 in
Table 5). Analysis of this model produced a chi-square value (543.6 with
134 degrees of freedom; see Table 5) that was necessarily larger than the
value obtained with Model I (233.6 with 98 degrees of freedom); the two
chi-squares would only be equal if the estimated trait correlations in
Model I were exactly equal to zero. Since the difference in the two chi-square
values (310.0) assessed against the difference in degrees of freedom (36)
is statistically significant and substantial, the analysis argues for
Model I.
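
As a rough check on the significance of this difference, the upper-tail probability of a chi-square of 310.0 on 36 degrees of freedom can be computed directly. The Python sketch below is an illustration with a hypothetical function name; it relies on the closed-form tail probability that holds only when the degrees of freedom are even, which happens to apply here. A general-purpose statistics library would normally be used instead.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate with even df.

    For df = 2m the tail probability has the closed form
    exp(-x/2) * sum_{i=0}^{m-1} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive even df")
    half = x / 2.0
    term = 1.0    # (x/2)**0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)**i / i! incrementally
        total += term
    return math.exp(-half) * total


# Difference between the two nested models as reported in the text.
p_value = chi2_sf_even_df(310.0, 36)
```

The resulting probability is far below any conventional significance level, which agrees with the conclusion that fixing the trait correlations at zero significantly worsens the fit.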

In order to make more precise tests of the data, a series of
alternative models were derived and their ability to fit the data (using the
Tucker-Lewis coefficient as an index) was examined. These models are
summarized in Table 5, including the general and null models, along with
their chi-squares, degrees of freedom, chi-square/df ratios, and
Tucker-Lewis reliabilities. Alternative models considered the consequences of
eliminating one or more of the trait factors, eliminating one or both of
the method factors, or constraining some of the correlations between these
factors to be zero. For example, the student method factor was eliminated
(Model III in Table 5) by setting all the factor loadings for this factor
(the Eta 2 factor in the LAMBDA matrix) equal to zero and setting all the
correlations (in the PSI matrix) involving this factor, including the
diagonal element, equal to zero. However, this model provides a poorer
fit to the data than Model I. Similarly, the elimination of the instructor
method factor also produces a poorer fit than does the general model, but
a better fit than when the student method factor was eliminated. This
shows that the student method factor is more important than the instructor
method factor.

Insert Table 5 About Here 



In summary, the analyses of these alternative models indicate that:

(1) Substantial portions of the variance in the data were accounted for
by both the different traits and the different methods. However,
exclusion of the trait factors was far more detrimental to the fit
of the model than was exclusion of the method factors.

(2) The elimination of correlations among the traits produced a poorer
fit to the data, indicating that the underlying traits considered
in this study are truly correlated.

(3) While there was substantial method variance in both the student
and the instructor ratings, elimination of the student method factor
was more detrimental than was elimination of the instructor method
factor. This indicates that there is more method variance in the
student ratings than in the instructor self evaluations.

A classic problem in factor analysis is the determination of the
number of factors. Researchers typically resort to heuristic guidelines.
In the present application, a precise statistical test is used to explore
the consequences of combining two or more factors (see Joreskog, 1974).
The Organization and Breadth of Coverage trait-factors were consistently
among the most highly correlated in each of the different models (e.g., see
the PSI matrix in Table 4). Furthermore, these two factors seem conceptually
related as well. Consequently, an eight-trait solution was tested that
combined these two factors. This was accomplished by eliminating the
Organization trait-factor, and allowing the Organization items to load
on the Breadth of Coverage factor. However, the results of this model
(Model X; see Table 5) produced a substantially poorer fit to the data
than did the nine-trait model. This implies that the best description
of the data requires all nine trait factors, or at least that these two
should not be combined. The ability to test the statistical and practical
impact of combining traits offers an important advantage for the
confirmatory factor analysis approach, particularly when research does not
begin with a well established factor structure.

Descriptive Statistics. The values in Table 4 can also be used to
derive descriptive statistics similar to those obtained with the ANOVA
model, and to assess the adequacy of each of the measures separately.
Loadings in the LAMBDA matrix can be interpreted in much the same way as
with common factor analysis; high loadings of items on a trait or method
factor support the existence of the factor. Trait and method variance
components for the general model (as depicted in Table 4) can be estimated
by squaring the factor loadings in the LAMBDA matrix (Joreskog, 1974), and
are presented in Table 6.

The trait variance in every measure, both student and faculty ratings,
was substantial and statistically significant. The average trait variance
across all measures was approximately twice that of the average method
variance. The trait variance in the student ratings was somewhat higher
than for the faculty self evaluations. However, the faculty self
evaluations had little method variance (except for the Learning/Value
factor), while that observed with the student ratings was substantial.
One factor, Learning/Value, had substantial method variance for both
student and instructor ratings. For instructor ratings of Learning/Value
there was substantially more method variance than trait variance.
Similarly, there was more method variance in the student ratings of
Examinations than there was trait variance.
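
The squaring of loadings into variance components can be sketched in a few lines of Python. The function name and the loadings below are hypothetical and are not taken from Table 4 or Table 6. The sketch assumes a standardized measure that loads on exactly one trait factor and one method factor, with the trait-method correlation fixed at zero as in the general model, so that the remainder is the error/uniqueness component.

```python
def variance_components(trait_loading, method_loading):
    """Split the variance of one standardized measure into components.

    Assumes the measure loads on one trait factor and one method factor
    that are uncorrelated with each other; the residual is whatever
    variance the two factors leave unexplained.
    """
    trait = trait_loading ** 2
    method = method_loading ** 2
    return {
        "trait": trait,
        "method": method,
        "residual": 1.0 - trait - method,
    }


# Hypothetical measure with a trait loading of .70 and a method loading of .30.
components = variance_components(0.70, 0.30)
```

For these made-up loadings the trait component (.49) is several times the method component (.09), the kind of comparison summarized for the actual measures in Table 6.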



Insert Table 6 About Here 

It must be emphasized that evidence for the existence of a particular
trait or method should be based upon the size of the factor loadings in the
LAMBDA matrix (e.g., Table 4) or the variance components based upon these
loadings (Table 6). Some researchers (e.g., Schmitt, et al., 1977) have
incorrectly suggested that support for the discriminant validity should be
based upon the correlations among the trait-factors (in the PSI matrix)
rather than the factor loadings. However, significant correlations in the
PSI matrix merely mean that the underlying trait-factors are correlated
in a manner that is independent of the method of data collection. This
situation is actually related to the fourth Campbell-Fiske criterion
(that the pattern of correlations among traits is similar for each of the
different methods), and Campbell and Fiske interpret this as evidence
supporting the discriminant validity of the measures. As with the
interpretation of other oblique factor analyses, it is only when correlations
between traits become extreme that the researcher need be concerned about
the distinctiveness of the different factors. As in the present
application, the correlations among factors may be quite consistent with the
substantive nature of the data.

Application of the matrix equation (equation 1) or the equivalent
tracing rule (Schmitt, 1978; Kenny, 1979) allows the decomposition of each
reproduced correlation into components that are due to trait variation,
method variation, and trait-method interactions. As previously discussed,
one of the limitations of both the Campbell-Fiske and ANOVA techniques is
that they make inferences about latent or unobserved variables that are
based upon observed relationships. For example, the true trait variation
in the convergent validities may be systematically increased or decreased,
depending upon the influence of the method or trait-method interactions.
A computational equation for decomposing each reproduced correlation into
distinct components is presented in Table 7. Application of this
decomposition for each of the reproduced correlations indicated that there was
very little method variation in any correlations other than the correlations
among the student ratings.
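
The idea behind this decomposition can be illustrated with a short Python sketch. It is not the computational equation of Table 7, which is not reproduced here; it is a simplified version that assumes each measure loads on one trait factor and one method factor and that trait-method factor correlations are fixed at zero, so that the trait-method cross term vanishes. The function name and all numbers are hypothetical.

```python
def decompose_correlation(trait_i, trait_j, psi_trait,
                          method_i, method_j, psi_method):
    """Split a reproduced correlation into trait and method parts.

    trait_i, trait_j:   trait-factor loadings of the two measures
    psi_trait:          correlation between their trait factors
                        (1.0 when both measure the same trait)
    method_i, method_j: method-factor loadings of the two measures
    psi_method:         correlation between their method factors
                        (1.0 when both use the same method)
    """
    trait_part = trait_i * psi_trait * trait_j
    method_part = method_i * psi_method * method_j
    return trait_part, method_part, trait_part + method_part


# Hypothetical convergent validity: same trait, two different methods
# whose method factors correlate .20.
trait_part, method_part, total = decompose_correlation(
    0.7, 0.6, 1.0, 0.3, 0.5, 0.2)
```

In this made-up case nearly all of the reproduced correlation (.42 of .45) is due to the shared trait, and only .03 is due to the correlated method factors.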



Insert Table 7 About Here 



Summary of the Confirmatory Factor Analysis Approach. The analysis
of MTMM matrices can be viewed as an application of confirmatory factor
analysis. The matrices upon which this analysis is based, except for the
constraints used to define various models, are familiar to users of
factor analysis, and the interpretation of the results is similar to
the interpretation of common factor analyses. However, the ability to
constrain various parameters allows the formulation and testing of various
descriptions of the latent trait and method factors. The "goodness of
fit" of the various models and their parameter estimates (e.g., factor
loadings) provide a direct test of the existence of various trait and
method factors.

Discussion

The purpose of this study was to compare different techniques for
analyzing multitrait-multimethod matrices. In particular, the conclusions
based upon the Campbell-Fiske criteria were compared with those generated
by the ANOVA model and the set of confirmatory factor analysis models.
At the most general level each of the different approaches showed good
support for both the convergent and divergent validity, but also indicated
some method or halo bias. The Campbell-Fiske criteria, through inspection,
showed that agreement on any one trait was relatively independent of
agreement on other traits (criterion 2), that the method variance was more
pronounced in the student ratings (criterion 3), and that there was evidence
of trait covariation that was independent of method (criterion 4). The
ANOVA model indicated that the variance component for the divergent validity
effect was approximately twice that for the method/halo effect.
Confirmatory factor analysis provided precise tests of each of the observations
generated by the Campbell-Fiske criteria, provided a statistical summary
similar to that generated by the ANOVA model, and also estimated separate
method and trait variance components for each of the different measures.
Confirmatory factor analysis also provided tests of additional hypotheses
that were not testable with either the Campbell-Fiske or the ANOVA
approaches.

As previously discussed, there are several important limitations of
the Campbell-Fiske approach to analysis of multitrait-multimethod matrices.
The most important are: 1) the informal nature of criteria and the lack
of clear statements of what constitutes satisfactory results; 2) the
inability to provide and incorporate information about the reliability
of the measures (particularly if reliability estimates are not available);
3) the cumbersome and unwieldy number of comparisons that must be made for
large problems; 4) the ambiguity between trait variance, trait covariance,
and method variance; 5) the reliance on observed variables for making
speculations about latent factors; and 6) the lack of any meaningful
summary statistics that describe the data.

Despite these problems, the Campbell-Fiske criteria performed well
in the present application. Each of the descriptive speculations based
upon this analysis was confirmed with the more rigorous tests of
alternative LISREL models. The approach, while lacking rigor, does
provide an important initial assessment of convergent and discriminant validity,
and method/halo biases. The popularity of the method, the ease of its
application, the heuristic appeal of the criteria, and the usefulness of
interpretations all dictate that these criteria continue to be used for
the preliminary inspection of any multitrait-multimethod matrix.

The limitations with the ANOVA model, though perhaps less apparent,
are more numerous than those encountered with the Campbell-Fiske analysis.
The principal advantage in the use of this approach is that it provides
a convenient summary of the relative magnitude of trait and method effects
and a test of their statistical significance. However, the
appropriateness of the test and the summary depend upon many of the same underlying
assumptions that were discussed with the Campbell-Fiske analysis, and the
detailed inspection of the multitrait-multimethod matrix required by the
Campbell-Fiske approach will often provide an indication of problems that
may be overlooked in the deceptively simple summary statistics resulting
from the ANOVA analysis. Finally, many of the heuristic speculations that
result from the application of the Campbell-Fiske criteria will be lost if
only the ANOVA model is used. For example, application of the
Campbell-Fiske criteria indicated that there was considerably more method/halo
effect in the student ratings than in the instructor ratings, that there
was true trait covariation among the different traits that was independent
of method, and that the correction for unreliability reduced the
method/halo effect in the student ratings. None of these findings could have
been identified by using the ANOVA model alone to analyze
multitrait-multimethod matrices. The ANOVA model does, however, provide useful
summary statistics that can supplement the Campbell-Fiske criteria.

The limitations in the application of the LISREL models stem
primarily from the difficulty of use. Paul Lohnes (1979, p. 33?), an
influential researcher and textbook author in the application of
quantitative analysis, recently stated that "LISREL is a complex and expensive
fitting and testing machine to which the author does not have access."
The key points seem to be the complexity, the expense, and the lack of
availability. The LISREL program is commercially available for a rather
nominal charge, so availability is not a critical problem. Complexity
represents a large initial hurdle that must be overcome, in much the same
way that the complexity of multiple regression was a limitation
of its application before the publication of the Draper & Smith (1966)
text. Similarly, the complexity of LISREL will become less of a problem
as the technique becomes more widely known and applied. The expense, in
terms of computer time, is an important limitation that probably will not
be easily resolved. While many finite problems (the kind that are likely
to appear in textbooks) can be solved with small amounts of computer time,
exploration of large scope problems quickly becomes very expensive. This
will be a particularly important limitation to the novice user who may be
forced to use considerable amounts of computer time in formulating the
problem.

Beyond these general difficulties in using LISREL, its application to
analysis of multitrait-multimethod data also imposes other limitations.
In order to test a model with free parameters for all of the off-diagonal
values in the PSI matrix (correlations between the factors) and the THETA
matrix (the uniqueness/error variances) a minimum of three traits and
three methods are needed. However, as demonstrated in this study, a
variety of constraints can be imposed that allow testing of alternative
models. Even when there are an adequate number of traits and methods,
it is necessary to have a large number of cases in order to provide strong
tests of alternative models and to obtain high Tucker-Lewis reliability
coefficients. This is particularly important when the researcher
sequentially develops alternative models on the basis of prior analysis of
the same data. This problem, taking advantage of chance variation that may
be specific to the particular data being considered, is not unique to
this analysis, and the best control for the problem is to cross-validate
the findings.

Despite these limitations, confirmatory factor analysis is clearly
the superior method to use in the analysis of multitrait-multimethod data.
In summary, some of its advantages are:

1) it tests inferences that are based upon the underlying latent
variables rather than relationships between observed variables;

2) it distinguishes variance due to traits and methods;

3) it allows comparison of a variety of alternative formulations
of the basic model and an overall test of the goodness-of-fit
for each proposed model;

4) it provides a separate statistical test of each estimated parameter
against the null hypothesis of a zero coefficient;

5) it provides convenient summary statistics of the amount of trait
and method variance in each separate measure, in each set of
measures, and for all the data combined;

6) it allows the decomposition of each reproduced correlation into
components that are attributable to trait and method effects;

7) it provides estimates of the reliability of each measure that are
incorporated into the analysis;

8) it provides an empirical test for the existence of correlations
among traits, among methods, and between traits and methods;

9) it provides an empirical test of the number of trait-factors and
method-factors that provide the best fit to the data.

These advantages, particularly when compared to those of alternative
techniques, demonstrate the importance of using LISREL modeling in the
analysis of multitrait-multimethod data.

Footnotes

The authors wish to acknowledge William McGarvey and Robert Cudeck
for their comments on an earlier draft of this paper, and for their help
in the application of LISREL.

1. The most general model and each of the alternative models could also
be specified in terms of x-variables instead of y-variables. Other
specifications of the most general model (e.g., permitting correlated
errors, etc.) are also possible. The particular specification used
in this study is the one most generally used by other researchers.

2. A necessary, but not sufficient, condition of identification is that
there are at least as many observed correlations as free parameters.
This is not a sufficient condition, since there may be overriding
constraints (Kenny, 1979). The LISREL program, however, checks for
identification (see Joreskog, 1978; Joreskog & Sorbom, 1978, for
further discussion) and generates an error message when the
proposed model is not identified.
TABLE I

Multitrait-Multimethod Matrix: Correlations Between Student and Faculty Self Evaluations in 329 Courses

                 INSTRUCTOR SELF-EVALUATION FACTORS                     |  STUDENT EVALUATION FACTORS
                 LEARN ENTHU ORGAN GROUP INDIV BRDTH EXAMS ASIGN WRKLD  | LEARN ENTHU ORGAN GROUP INDIV BRDTH EXAMS ASIGN WRKLD

INSTRUCTOR SELF-EVALUATION FACTORS
LEARNING/VALUE   (830)   286   117    14   -70   127    -8   243    30  | <405>   214   171   192    29   256   183   207   -58
ENTHUSIASM         347 (820)    10    30   -19   124    80    -8   -10  |    99 <476>   132    46    31   149    89    26   -29
ORGANIZATION       149    13 (740)  -147    72   129   262   167   116  |   -12   -41 <254>  -233   -53    87    17    18    40
GROUP INTERACT      17    35  -180 (900)    20   107    85    46   -92  |    82   -13   -34 <454>   131    -3    -7    86     1
INDIVID RAPPRT     -85   -23    93    24 (820)   -14   147   218    59  |  -121   -16    36    -4 <250>  -142    55    -6    32
BREADTH            152   149   163   123   -17 (840)   203    85   -41  |    93    -7    67   -24  -188 <367>   -88    44   -31
EXAMINATIONS        -9   101   349   102   186   254 (760)   218    91  |   -35   -32    86  -137   -36     2 <135>   -16   123
ASSIGNMENTS        319   -11   232    58   288   111   299 (700)   214  |    77   -90    -3   -47   -24    95   -19 <356>   225
WRKLD/DIFFCLTY      39   -13   161  -116    78   -53   125   306 (700)  |    16   -98   -48   -81     3    21   -56   122 <539>

STUDENT EVALUATION FACTORS
LEARNING/VALUE   <456>   112   -15    89  -137   104   -41    94    20  | (950)   455   528   369   222   494   481   521    58
ENTHUSIASM         240 <537>   -48   -14   -19    -8   -32  -109  -120  |   476 (960)   497   305   350   339   419   248    17
ORGANIZATION       195   151 <306>   -37    41    76   102    -4   -59  |   562   526 (930)   215   334   562   571   345   -46
GROUP INTERACT     213    52  -239 <484>    -4   -27  -159   -57   -98  |   382   314   225 (980)   420   165   341   305   -54
INDIVID RAPPRT      33    34   -63   141 <282>  -209   -42   -30     3  |   233   364   353   433 (960)   156   504   288    80
BREADTH            290   169   104    -4  -162 <413>     2   117    26  |   522   357   601   172   164 (940)   334   403   178
EXAMINATIONS       208   102    20    -7    63  -100 <166>   -21   -69  |   512   443   615   357   534   357 (930)   423   -23
ASSIGNMENTS        237    30    21    94    -7    50   -19 <443>   151  |   557   226   373   321   307   433   457 (920)   204
WRKLD/DIFFCLTY     -69   -35    50     1    37   -36   151   289 <691>  |    64    18   -51   -59    87   196   -26   228 (870)

NOTE: Values enclosed in ( ) in the diagonals of the upper left and lower right matrices (the heterotrait-monomethod
matrices) are the reliability (coefficient alpha) coefficients. Values enclosed in < > in the diagonals of the lower left
and upper right matrices (the heterotrait-heteromethod matrices) are the convergent validity coefficients. All coefficients
below the main diagonal of the entire MTMM matrix have been corrected for unreliability. Correlations (presented
without decimal points) greater than 100 (i.e., .10) are statistically significant.
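
The correction for unreliability mentioned in the note can be illustrated with the standard correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. The table does not state the formula, so treating it as this standard correction is an assumption; the Python function name is hypothetical. The example uses the Learning/Value convergent validity (.405) and the instructor and student reliabilities (.83 and .95) from Table I.

```python
import math


def correct_for_attenuation(r, reliability_x, reliability_y):
    """Standard correction for attenuation: r / sqrt(rel_x * rel_y)."""
    return r / math.sqrt(reliability_x * reliability_y)


# Learning/Value: observed convergent validity and the two coefficient alphas.
corrected = correct_for_attenuation(0.405, 0.83, 0.95)
```

Rounded to three decimals the result is .456, the corrected Learning/Value convergent validity shown below the main diagonal of the table.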






Computational Equations and Results of the ANOVA
Analysis of a Multitrait-Multimethod Matrix



Class (ci 



Cuaputations for Subs of Sciuares 

s s_ 

(rt) 



an<l Bsti»atea Variance Coiponent 



DF 



pooencs 
^friaoce Coaponent 
(NSc-flScta)/Da 



Traits 
(Oiscriainant Yalidity) 

Class X *!etboi1s 
(nethod/Naio Bffiict) 

C X T X n 
(error) 



Hna (rv - rt) 
irna {rf - rt) 
Nnafl-rv-rftrt) 



(»-l) (n-1) 
(M-1) <a-l) 
(N-l) (n-1) (a-1) 



(NSct*flScta)/)i 
(nSca-nseta)/a 
RScta 



m m nuK^dr of iiff.^i^nt ' 
rt • «iVQrarjo carrolJl" 



>thod.s 




of 



souarB af 



fsults for Oncorre^ea and Corrected Correlation Hatricies 
OMCORRECTED *CORBEliTTO|iS 



Class 328 
{Convergent) 

C.x Tr4tt 262i| 
(t^ivergcnt) 

S X aethod 328 
(H^thod/«alo) 

C X T X « 262» 
{Error) 



I CORRBUTIMS C08RECTED FOR ITTBNUATIGM 

!i!S!-J Hs r vabcp 



1000.55 
3016.25 

123/^.113 



3*078 e.SU** 0. 145 I 
1.1«I9 2.aa** 0.3«0 I 
«.27»* 0.171 I 
0,^70 I 



2.011 
0.<)70 



bods 



1085.89 3.311 8. 17** 0.162 

3101.05 1.182 2.97** 0.392 

690.5a 2.099 5.27** 0.189 

10a^5l 0.398 — 0.398 

22*50? SJ"0*^^3; Bv«0,707; Rf«0.3.'>0 
N = 329 classes; n = 9 traits; m = 2 methods




Table III

Configuration of the LAMBDA and PSI Matrices in the GENERAL MODEL

LAMBDA (Factor Loading Matrix)

                        Inst   Stdt   Lrn    Ent    Org    Grp    Ind    Brd    Exm    Asg    Work
                        Method Method Trait  Trait  Trait  Trait  Trait  Trait  Trait  Trait  Trait
                        Factor Factor Factor Factor Factor Factor Factor Factor Factor Factor Factor
                        Eta 1  Eta 2  Eta 3  Eta 4  Eta 5  Eta 6  Eta 7  Eta 8  Eta 9  Eta 10 Eta 11

Instructor
  Learning/Value        Ilrn   0      Ilrn   0      0      0      0      0      0      0      0
  Enthusiasm            Ient   0      0      Ient   0      0      0      0      0      0      0
  Organization          Iorg   0      0      0      Iorg   0      0      0      0      0      0
  Group Interaction     Igrp   0      0      0      0      Igrp   0      0      0      0      0
  Individual Rapport    Iind   0      0      0      0      0      Iind   0      0      0      0
  Breadth               Ibrd   0      0      0      0      0      0      Ibrd   0      0      0
  Examinations          Iexm   0      0      0      0      0      0      0      Iexm   0      0
  Assignments           Iasg   0      0      0      0      0      0      0      0      Iasg   0
  Workload/Difficulty   Iwrk   0      0      0      0      0      0      0      0      0      Iwrk

Student
  Learning/Value        0      Slrn   Slrn   0      0      0      0      0      0      0      0
  Enthusiasm            0      Sent   0      Sent   0      0      0      0      0      0      0
  Organization          0      Sorg   0      0      Sorg   0      0      0      0      0      0
  Group Interaction     0      Sgrp   0      0      0      Sgrp   0      0      0      0      0
  Individual Rapport    0      Sind   0      0      0      0      Sind   0      0      0      0
  Breadth               0      Sbrd   0      0      0      0      0      Sbrd   0      0      0
  Examinations          0      Sexm   0      0      0      0      0      0      Sexm   0      0
  Assignments           0      Sasg   0      0      0      0      0      0      0      Sasg   0
  Workload/Difficulty   0      Swrk   0      0      0      0      0      0      0      0      Swrk

PSI (Correlations Between Factors)


Inst 

Method 

Factor 


stdt 

Method 

Factor 


Lrn 

Trait 

Factor 


Ent 

Trult 

/actor 


Org 

Traits 

Factor 


Crp 

Trait 

Factor 


Ind 

Trait 

Factor 


Bre 

Trait 

Factor 


Exm 

Trali 

Factor 


Asg 

Trait 

Factor 


VorK 

Trait 

F'\ctor 




Eta 1 


Eta 2 


Eta 3 


Eta ^ 


Eta ^ 


Eta 6 


Eta 7 


Eta 6 


Eta 9 


Eta 10 


Eta 11 


Instructor Method 


1.0 






















Student Mftthod 


rl.2 


1.0 




















Learning/Value 


0 


0 


1.0 


















Enthusiasm 


0 


0 




1.0 
















Organisation 


0 


0 


r5.3 




1.0 














Oroup Interaotiou 
Individual Rapport 
Breadth 
Exaiti nations 
Assigusents 
Workload/Dif 1*1 culty 


0 
0 
0 
0 
0 

0 r 


0 
0 
0 
0 
0 
0 


r6.3 
r7.3 
r8.3 
r9.3 
rlO.3 
rll.3 


r6.l» 

rd.li 

rlO.li 
rll.i* 


r6.5 
r7.5 
r8.5 
r9.5 
rlO.5 
rll.5 


1.0 
r7.6 
r6.6 
r9.6 
rlO.6 
rll.6 


1.0 
r8.7 
r9.7 
rlO.f 
rll.7 


1.0 
r9.8 
rlO.8 
rll.8 


1.0 
rlO.9 
rll.9 


1.0 
rll.lO 


1.0 



to m"d^ rro'J'^XV^aV"^.*'^^^.*^^ ^-^--^ value, r.p„,e„t variance 'ttrlb.table 

1- tU. I„ other •ppllc.?lon":/ J « l^lt : " ; usl^nr^Jr^"""^"' Independently and rlx.d 
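
The LAMBDA configuration of the general model follows a simple rule: each of the 18 measures has a free loading on its own method factor and on its own trait factor, and every other loading is fixed at zero. The sketch below builds that pattern in plain Python. The function name, the row labels, and the use of True/False to mark free and fixed loadings are illustrative assumptions, not part of the original analysis.

```python
TRAITS = ["Lrn", "Ent", "Org", "Grp", "Ind", "Brd", "Exm", "Asg", "Wrk"]
METHODS = ["Inst", "Stdt"]


def lambda_pattern(traits=TRAITS, methods=METHODS):
    """Return (row_labels, pattern) for the general-model LAMBDA matrix.

    pattern[i][j] is True where loading (i, j) is a free parameter and
    False where it is fixed at zero. Columns are ordered as in the table:
    method factors first (Eta 1, Eta 2), then the trait factors.
    """
    n_factors = len(methods) + len(traits)
    rows, pattern = [], []
    for m_idx, method in enumerate(methods):
        for t_idx, trait in enumerate(traits):
            row = [False] * n_factors
            row[m_idx] = True                  # own method factor
            row[len(methods) + t_idx] = True   # own trait factor
            rows.append(f"{method}-{trait}")
            pattern.append(row)
    return rows, pattern


rows, pattern = lambda_pattern()
for label, row in zip(rows, pattern):
    print(f"{label:9s}", " ".join("x" if free else "0" for free in row))
```

Printing the pattern gives an 18 x 11 grid with exactly two free loadings per row, which is the layout shown in the LAMBDA table.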



Table IV

Configuration of the LAMBDA and PSI Matrices in the GENERAL MODEL

LAMBDA (Factor Loading Matrix)

                        Inst          Stdt    Lrn     Ent     Org     Grp     Ind     Brd     Exm     Asg     Work
                        Method        Method  Trait   Trait   Trait   Trait   Trait   Trait   Trait   Trait   Trait
                        Factor        Factor  Factor  Factor  Factor  Factor  Factor  Factor  Factor  Factor  Factor
                        Eta 1         Eta 2   Eta 3   Eta 4   Eta 5   Eta 6   Eta 7   Eta 8   Eta 9   Eta 10  Eta 11
Instructor
  Learning/Value        -0.666        0.0     0.525   0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
  Enthusiasm            [illegible]   0.0     0.0     0.638   0.0     0.0     0.0     0.0     0.0     0.0     0.0
  Organization          -0.067        0.0     0.0     0.0     0.523   0.0     0.0     0.0     0.0     0.0     0.0
  Group Interaction     0.174         0.0     0.0     0.0     0.0     0.732   0.0     0.0     0.0     0.0     0.0
  Individual Rapport    0.055         0.0     0.0     0.0     0.0     0.0     0.653   0.0     0.0     0.0     0.0
  Breadth               0.093         0.0     0.0     0.0     0.0     0.0     0.0     0.587   0.0     0.0     0.0
  Examinations          0.236         0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.716   0.0     0.0
  Assignments           -0.156        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.839   0.0
  Workload/Difficulty   -0.097        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.643
Student
  Learning/Value        0.0           0.697   0.719   0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
  Enthusiasm            0.0           0.571   0.0     0.730   0.0     0.0     0.0     0.0     0.0     0.0     0.0
  Organization          0.0           0.729   0.0     0.0     0.735   0.0     0.0     0.0     0.0     0.0     0.0
  Group Interaction     0.0           0.480   0.0     0.0     0.0     0.648   0.0     0.0     0.0     0.0     0.0
  Individual Rapport    0.0           0.612   0.0     0.0     0.0     0.0     0.515   0.0     0.0     0.0     0.0
  Breadth               0.0           0.567   0.0     0.0     0.0     0.0     0.0     0.821   0.0     0.0     0.0
  Examinations          0.0           0.829   0.0     0.0     0.0     0.0     0.0     0.0     0.527   0.0     0.0
  Assignments           0.0           0.615   0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.579   0.0
  Workload/Difficulty   0.0           0.101   0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.879

PSI (Correlations Between Factors)

All diagonal elements are 1.0. The correlation between the Instructor Method Factor (Eta 1) and the Student Method Factor (Eta 2) is -0.271. All correlations between method factors and trait factors are 0.0.

Correlations among the trait factors, entries below the diagonal, by column, in the order printed:

  Lrn  (Eta 3):    0.273   0.335   0.162   [illegible]   0.437   0.202   [illegible]
  Ent  (Eta 4):    0.324   0.057   0.085   0.227   0.199   -0.012
  Org  (Eta 5):    [illegible]   0.127   0.521   0.466   0.[?]31
  Grp  (Eta 6):    0.211   -0.017   0.014   0.101
  Ind  (Eta 7):    -0.176   0.350   0.263   0.117
  Brd  (Eta 8):    0.179   0.320   0.198
  Exm  (Eta 9):    0.332   0.074
  Asg  (Eta 10):   0.377
  Displaced row (Eta 3 to Eta 6):   0.097   -0.003   -0.022   -0.033

THETA EPS: Matrix of Uniqueness/Error Variances (values are the diagonals of an 18 x 18 square matrix)

                                   Lrn     Enthus  Organ   Group   Individ Breadth Exams   Asignmnt Workld
Instructor Self Evaluations of     0.343   0.515   0.662   0.433   0.519   0.542   0.423   0.277    0.560
Student Evaluations of             0.115   0.195   0.129   0.337   0.409   0.159   0.193   0.447    0.236
TABLE V
Summary of Tested Models

Model descriptions are largely illegible. The legible fragments refer to the method factors specified in each model, whether the factors are correlated, the absence of trait-method correlations, and the THETA (error/uniqueness) matrix.

Fit statistics, in the order printed:

ChiSq       DF     ChiSq/DF   [illegible]
233.6       98     2.38       .977
330.7       99     3.31       .9[?]2
387.7       108    3.59       .952
550.6       108    5.10       .933
543.6       136    3.99       .951
544.3       135    4.03       .950
1126.9      117    9.63       .858
4213.7      152    27.7       .561
10564.8     171    61.8       .000
466.0       106    4.4        .9[?]
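
The ChiSq/DF column is simply the chi-square divided by its degrees of freedom, so it offers a quick check on digits that are hard to read in the scan. The sketch below recomputes the ratio for four rows whose chi-square and DF are clearly legible; the function name and the list of rows are illustrative assumptions.

```python
def chisq_df_ratio(chisq, df):
    """Ratio of the chi-square statistic to its degrees of freedom."""
    return chisq / df


# (chi-square, degrees of freedom) pairs read from clearly legible rows.
legible_rows = [(233.6, 98), (387.7, 108), (550.6, 108), (1126.9, 117)]

ratios = [chisq_df_ratio(chisq, df) for chisq, df in legible_rows]
for (chisq, df), ratio in zip(legible_rows, ratios):
    print(f"{chisq:8.1f} / {df:3d} = {ratio:.2f}")
```

Smaller ratios indicate better fit relative to model complexity, which is how the first row (2.38) compares favorably with the more restricted models further down the table.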
TABLE VI

Trait and Method Variance Components For Model I (the General Model)

                          Instructor Ratings          Student Ratings
                          Trait   Method  Error       Trait   Method  Error
LEARNING/VALUE            .275    .444    .343        .517    .486    .115
ENTHUSIASM                .407    .060    .515        .533    .326    .195
ORGANIZATION              .274    .004    .662        .540    .536    .123
GROUP INTERACT            .536    .030    .433        .420    .230    .337
INDIVID RAPPORT           .426    .003    .519        .265    .374    .409
BREADTH                   .345    .009    .542        .674    .321    .159
EXAMINATIONS              .513    .056    .423        .278    .687    .193
ASSIGNMENTS               .704    .024    .277        .335    .378    .447
WRKLD/DIFFCLTY            .413    .009    .560        .773    .010    .236

Mean Across All
9 Evaluations             .432    .071    .475        .481    .372    .246

Note: The variance components were derived by squaring the Trait and Method loadings from Table IV, and using the unsquared values from the Theta Epsilon matrix.
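
The derivation described in the note can be illustrated with a short sketch: square the trait and method loadings, and take the error component directly from the Theta Epsilon diagonal. The function name is hypothetical, and the inputs are the Breadth loadings and uniquenesses read from the parameter-estimate table.

```python
def variance_split(trait_loading, method_loading, uniqueness):
    """Split a measure's variance into trait, method, and error components.

    Trait and method components are squared loadings; the error component
    is the (unsquared) Theta Epsilon value.
    """
    return {
        "trait": trait_loading ** 2,
        "method": method_loading ** 2,
        "error": uniqueness,
    }


# Breadth: instructor self-rating and student rating.
instructor_breadth = variance_split(0.587, 0.093, 0.542)
student_breadth = variance_split(0.821, 0.567, 0.159)

for name, parts in [("Instructor", instructor_breadth), ("Student", student_breadth)]:
    print(name, {k: round(v, 3) for k, v in parts.items()})
```

For Breadth, the squared loadings round to the trait and method entries printed in the table for both instructor and student ratings.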
Table VII
Decomposition of Reproduced Correlations

General Equation for the Decomposition of any Reproduced Correlation:
The Correlation Between any Measure (X) and any other Measure (Y)

        Trait Component     Method Component    Trait-Method Interaction Component

R XY =  (TX)(TY)(RTXTY)  +  (MX)(MY)(RMXMY)  +  (MX)(TY)(RMXTY)  +  (MY)(TX)(RTXMY)

Where:

TX    : The Trait Loading (in Lambda Matrix) for X-Variable
TY    : The Trait Loading (in Lambda Matrix) for Y-Variable
RTXTY : The Correlation (in PSI Matrix) Between Trait of X-Variable and Trait of Y-Variable
MX    : The Method-Factor Loading (in Lambda Matrix) of X-Variable
MY    : The Method-Factor Loading (in Lambda Matrix) of Y-Variable
RMXMY : The Correlation (in PSI Matrix) Between Method of X-Variable and Method of Y-Variable
RMXTY : The Correlation (in PSI Matrix) Between Method of X-Variable and Trait of Y-Variable
RTXMY : The Correlation (in PSI Matrix) Between Trait of X-Variable and Method of Y-Variable

Decomposition of a Convergent Validity Coefficient: Correlation Between Instructor Ratings of Breadth (X) and Student Ratings of Breadth (Y)

     =  (.587)(.821)(1.0)  +  (.093)(.567)(-.271)  +  (.093)(.821)(0.0)  +  (.567)(.587)(0.0)

Decomposition of a Heterotrait-Heteromethod Correlation: Correlation Between Instructor Ratings of Organization (X) and Student Ratings of Breadth (Y)

     =  (.523)(.821)(.466)  +  (-.067)(.567)(-.271)  +  (-.067)(.821)(0.0)  +  (.567)(.523)(0.0)

Decomposition of Two Heterotrait-Monomethod Correlations:

Correlation Between Instructor Ratings of Organization (X) and Instructor Ratings of Breadth (Y)

     =  (.523)(.587)(.466)  +  (-.067)(.093)(1.0)  +  (-.067)(.587)(0.0)  +  (.093)(.523)(0.0)

Correlation Between Student Ratings of Organization (X) and Student Ratings of Breadth (Y)

     =  (.735)(.821)(.466)  +  (.729)(.567)(1.0)  +  (.729)(.821)(0.0)  +  (.567)(.735)(0.0)
Note: All the trait-method interactions were fixed to be zero (see Table IV).

L°t-"rtrtS!^ StSoiTir^i': '--"'''^ ' — ^-^^^^^ -relation 
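
The general decomposition equation can be evaluated directly. The sketch below is a minimal illustration under the table's definitions; the function and argument names are hypothetical, and the trait-method correlations default to zero because they were fixed at zero in the general model. The two calls use the loadings and factor correlations given for the Organization and Breadth examples.

```python
def reproduced_correlation(tx, ty, mx, my, r_txty, r_mxmy, r_mxty=0.0, r_txmy=0.0):
    """Decompose a reproduced correlation into its three components.

    Returns (trait, method, interaction, total), where
      trait       = TX * TY * RTXTY
      method      = MX * MY * RMXMY
      interaction = MX * TY * RMXTY + MY * TX * RTXMY
    """
    trait = tx * ty * r_txty
    method = mx * my * r_mxmy
    interaction = mx * ty * r_mxty + my * tx * r_txmy
    return trait, method, interaction, trait + method + interaction


# Heterotrait-heteromethod: instructor Organization (X) with student Breadth (Y).
hetero = reproduced_correlation(tx=0.523, ty=0.821, mx=-0.067, my=0.567,
                                r_txty=0.466, r_mxmy=-0.271)

# Same method, different traits: student Organization (X) with student Breadth (Y).
student = reproduced_correlation(tx=0.735, ty=0.821, mx=0.729, my=0.567,
                                 r_txty=0.466, r_mxmy=1.0)

for label, parts in [("Inst Org x Stdt Brd", hetero), ("Stdt Org x Stdt Brd", student)]:
    trait, method, interaction, total = parts
    print(f"{label}: trait={trait:.3f} method={method:.3f} "
          f"interaction={interaction:.3f} total={total:.3f}")
```

In the student-student case the method component is larger than the trait component, which shows how a shared method factor can inflate the correlation between ratings of two different traits.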






